 abstract interpretation


Cost-Driven Synthesis of Sound Abstract Interpreters

Gu, Qiuhan, Singh, Avaljot, Singh, Gagandeep

arXiv.org Artificial Intelligence

Constructing abstract interpreters that provide global soundness guarantees remains a major obstacle in abstract interpretation. We investigate whether modern LLMs can reduce this burden by leveraging them to synthesize sound, non-trivial abstract interpreters across multiple abstract domains in the setting of neural network verification. We formulate synthesis as a constrained optimization problem and introduce a novel, mathematically grounded cost function for measuring unsoundness under strict syntactic and semantic constraints. Based on this formulation, we develop a unified framework that combines LLM-based generation with syntactic and semantic validation and a quantitative, cost-guided feedback mechanism. Empirical results demonstrate that our framework not only matches the quality of handcrafted transformers, but more importantly, discovers sound, high-precision transformers for complex nonlinear operators that are absent from the existing literature.
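
As a concrete illustration of the kind of cost function described above (a minimal sketch under assumed names, not the paper's actual formulation), a candidate interval transformer for a nonlinear operator such as tanh can be scored by sampling concrete inputs and measuring how far their outputs escape the candidate's abstract output; a sound transformer incurs zero cost on every sample.

```python
# Sketch of a sampling-based unsoundness cost for a hypothetical
# LLM-generated interval transformer of tanh (names are illustrative).
import math
import random

def candidate_tanh_transformer(lo, hi):
    # Tangent-line relaxation around the interval midpoint; deliberately
    # simplistic and possibly unsound on some intervals.
    mid = (lo + hi) / 2.0
    slope = 1.0 - math.tanh(mid) ** 2          # derivative of tanh at mid
    return (math.tanh(mid) + slope * (lo - mid),
            math.tanh(mid) + slope * (hi - mid))

def unsoundness_cost(transformer, lo, hi, samples=1000):
    """Average distance by which concrete outputs escape the abstract output."""
    out_lo, out_hi = transformer(lo, hi)
    total = 0.0
    for _ in range(samples):
        y = math.tanh(random.uniform(lo, hi))
        total += max(0.0, out_lo - y) + max(0.0, y - out_hi)
    return total / samples

print(unsoundness_cost(candidate_tanh_transformer, -2.0, 2.0))  # sound here: 0.0
print(unsoundness_cost(candidate_tanh_transformer, 1.0, 2.0))   # unsound: positive cost
```

Zero cost on samples is necessary but not sufficient for soundness, which is why a cost signal of this kind has to be paired with semantic validation.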


Can LLMs Formally Reason as Abstract Interpreters for Program Analysis?

Mitchell, Jacqueline L., Kim, Brian Hyeongseok, Zhou, Chenyu, Wang, Chao

arXiv.org Artificial Intelligence

LLMs have demonstrated impressive capabilities in code generation and comprehension, but their potential to perform program analysis in a formal, automatic manner remains under-explored. To that end, we systematically investigate whether LLMs can reason about programs using a program analysis framework called abstract interpretation. We prompt LLMs to follow two different strategies, denoted Compositional and Fixed Point Equation, to formally reason in the style of abstract interpretation, which, to the best of our knowledge, has not been done before. We validate our approach using state-of-the-art LLMs on 22 challenging benchmark programs from the Software Verification Competition (SV-COMP) 2019 dataset, which is widely used in program analysis. Our results show that our strategies can elicit abstract interpretation-based reasoning in the tested models, but LLMs are susceptible to logical errors, especially while interpreting complex program structures, as well as to general hallucinations. This highlights key areas for improvement in the formal reasoning capabilities of LLMs.
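
To make the prompting styles concrete, the sketch below (illustrative only; the toy loop and code are not taken from the paper) shows the kind of interval-domain reasoning the Fixed Point Equation strategy asks a model to carry out: set up the loop-head equation, then iterate transfer and join steps until the abstract state stabilizes.

```python
# Interval analysis of:  i = 0; while (i < 10) i = i + 2;
def join(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))

entry = (0, 0)            # abstract value of i on entering the loop
head = entry
while True:
    body_in = (head[0], min(head[1], 9))          # assume guard i < 10
    body_out = (body_in[0] + 2, body_in[1] + 2)   # effect of i = i + 2
    new_head = join(entry, body_out)              # head = entry joined with body_out
    if new_head == head:                          # fixed point reached
        break
    head = new_head

exit_state = (max(head[0], 10), head[1])          # assume negated guard i >= 10
print("loop invariant for i:", head, "| exit:", exit_state)
# -> loop invariant for i: (0, 11) | exit: (10, 11)
# (Widening is omitted; this ascending chain is finite, so iteration terminates.)
```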


C3: Learning Congestion Controllers with Formal Certificates

Yang, Chenxi, Saxena, Divyanshu, Dwivedula, Rohit, Mahajan, Kshiteej, Chaudhuri, Swarat, Akella, Aditya

arXiv.org Artificial Intelligence

Learning-based congestion controllers offer better adaptability compared to traditional heuristic algorithms. However, the inherent unreliability of learning techniques can cause learning-based controllers to behave poorly, creating a need for formal guarantees. While methods for formally verifying learned congestion controllers exist, these methods offer binary feedback that cannot optimize the controller toward better behavior. We improve this state-of-the-art via C3, a new learning framework for congestion control that integrates the concept of formal certification in the learning loop. C3 uses an abstract interpreter that can produce robustness and performance certificates to guide the training process, rewarding models that are robust and performant even on worst-case inputs. Our evaluation demonstrates that unlike state-of-the-art learned controllers, C3-trained controllers provide both adaptability and worst-case reliability across a range of network conditions.
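
The sketch below illustrates the certificate-producing step in miniature (interval bound propagation through a hypothetical two-layer ReLU policy, not the actual C3 abstract interpreter): worst-case output bounds over a whole box of observations are exactly the kind of quantity that can be turned into a reward or penalty inside the training loop.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    # Bound W @ x + b over all x in the box [lo, hi] (standard interval arithmetic).
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

def interval_relu(lo, hi):
    return np.maximum(lo, 0), np.maximum(hi, 0)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)   # hypothetical policy weights
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

obs = np.array([0.5, -0.2, 1.0, 0.3])           # nominal network observation
eps = 0.1                                       # worst-case perturbation radius
lo, hi = obs - eps, obs + eps
lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
lo, hi = interval_affine(lo, hi, W2, b2)
print("certified action range:", lo, hi)        # every obs in the box maps into it
```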


Towards Compositional Interpretability for XAI

Tull, Sean, Lorenz, Robin, Clark, Stephen, Khan, Ilyas, Coecke, Bob

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is currently based largely on black-box machine learning models which lack interpretability. The field of eXplainable AI (XAI) strives to address this major concern, which is critical in high-stakes areas such as the finance, legal, and health sectors. We present an approach to defining AI models and their interpretability based on category theory. For this we employ the notion of a compositional model, which sees a model in terms of formal string diagrams which capture its abstract structure together with its concrete implementation. This comprehensive view incorporates deterministic, probabilistic and quantum models. We compare a wide range of AI models as compositional models, including linear and rule-based models, (recurrent) neural networks, transformers, VAEs, and causal and DisCoCirc models. Next, we give a definition of interpretation of a model in terms of its compositional structure, demonstrating how to analyse the interpretability of a model, and using this to clarify common themes in XAI. We find that what makes the standard 'intrinsically interpretable' models so transparent is brought out most clearly diagrammatically. This leads us to the more general notion of compositionally-interpretable (CI) models, which additionally include, for instance, causal, conceptual space, and DisCoCirc models. We next demonstrate the explainability benefits of CI models. Firstly, their compositional structure may allow the computation of other quantities of interest, and may facilitate inference from the model to the modelled phenomenon by matching its structure. Secondly, they allow for diagrammatic explanations of their behaviour, based on influence constraints, diagram surgery and rewrite explanations. Finally, we discuss many future directions for the approach, raising the question of how to learn such meaningfully structured models in practice.
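
As a very loose illustration in code (the paper works with categorical string diagrams, not Python, and the component names below are made up), a compositionally interpretable model keeps its structure as an explicit composition of named boxes, so the structure can be read off directly and a single box can be swapped out in the spirit of diagram surgery.

```python
# A model as an explicit composition of named "boxes" (hypothetical components).
pipeline = [
    ("embed",  lambda x: [xi * 0.5 for xi in x]),
    ("score",  lambda x: sum(x)),
    ("decide", lambda s: "accept" if s > 1.0 else "reject"),
]

def run(pipeline, x):
    for name, box in pipeline:
        x = box(x)
    return x

print([name for name, _ in pipeline])          # the abstract (compositional) structure
print(run(pipeline, [1.0, 2.0, 0.5]))          # the concrete behaviour: 'accept'
pipeline[1] = ("score", lambda x: max(x))      # "diagram surgery" on one box
print(run(pipeline, [1.0, 2.0, 0.5]))          # behaviour after surgery: 'reject'
```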


C Analyzer: A Static Program Analysis Tool for C Programs

Solanki, Rajendra Kumar

arXiv.org Artificial Intelligence

As the world becomes increasingly dependent on software, writing bug-free, correct programs is crucial. Program verification based on formal methods can guarantee this by detecting run-time errors in safety-critical systems, avoiding possible adverse impacts on human life and saving time and money. This work leverages abstract interpretation techniques for the static analysis of C programs. C Analyzer is a tool developed for this purpose; it provides a plug-and-play architecture in which multiple abstract domains can be used. C Analyzer supports four abstract domains - Interval, Octagon, Polyhedra, and Bit Vector - and these different domains are used to obtain the required precision in program verification. The tool uses the Clang API of the LLVM C/C++ compiler frontend to generate and traverse the Control Flow Graph (CFG) of a given C program, and during CFG traversal it generates invariants in the chosen abstract domain for the statements in each basic block. Using these invariants, properties of a program such as division by zero, modulo by zero, and arithmetic overflow can be analyzed. We also use the source-to-source transformation tool CIL (C Intermediate Language) to rewrite some C constructs into simpler ones, such as transforming logical operators, switch statements, and conditional operators into if-else ladders, and do-while and for loops into while loops. Using C Analyzer, C program constructs such as declarations, assignments, binary operations (arithmetic, relational, bitwise shift, etc.), conditions (if-else), loops (while, do-while, for), nested conditions, and nested loops can be analyzed. Currently, the tool does not support arrays, structures, unions, pointers, or function calls.
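
The sketch below shows, in miniature, the style of interval-domain reasoning described above (written in Python for brevity; it is not the C Analyzer code, which traverses Clang CFGs): transfer functions for guards and assignments, a join at the end of an if-else, and a property check for a possible division by zero.

```python
# Analyse:  x = input in [-5, 5]; if (x < 0) x = -x; y = 100 / x;
def meet_lt(iv, c):        # refine an integer interval with guard  v < c
    return (iv[0], min(iv[1], c - 1))

def meet_ge(iv, c):        # refine an integer interval with guard  v >= c
    return (max(iv[0], c), iv[1])

def join(a, b):            # merge the two branches of the if-else
    return (min(a[0], b[0]), max(a[1], b[1]))

x = (-5, 5)                            # invariant after the input statement
then_x = meet_lt(x, 0)                 # branch x < 0  ->  [-5, -1]
then_x = (-then_x[1], -then_x[0])      # x = -x        ->  [1, 5]
else_x = meet_ge(x, 0)                 # branch x >= 0 ->  [0, 5]
x = join(then_x, else_x)               # after if-else ->  [0, 5]

if x[0] <= 0 <= x[1]:                  # property check for y = 100 / x
    print("possible division by zero: x in", x)
```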


Certifying the Fairness of KNN in the Presence of Dataset Bias

Li, Yannan, Wang, Jingbo, Wang, Chao

arXiv.org Artificial Intelligence

We propose a method for certifying the fairness of the classification result of a widely used supervised learning algorithm, k-nearest neighbors (KNN), under the assumption that the training data may have historical bias caused by systematic mislabeling of samples from a protected minority group. To the best of our knowledge, this is the first certification method for KNN based on three variants of the fairness definition: individual fairness, $\epsilon$-fairness, and label-flipping fairness. We first define the fairness certification problem for KNN and then propose sound approximations of the complex arithmetic computations used in the state-of-the-art KNN algorithm. These approximations lift the computation from the concrete domain to an abstract domain, reducing the computational cost. We show the effectiveness of this abstract interpretation-based technique through experimental evaluation on six datasets widely used in the fairness research literature. We also show that the method is accurate enough to obtain fairness certifications for a large number of test inputs, despite the presence of historical bias in the datasets.
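
The following sketch conveys the flavor of such an abstraction for the label-flipping variant (a toy majority-vote argument with hypothetical inputs, not the paper's certification algorithm): the vote count among the k nearest neighbors is replaced by an interval covering every admissible mislabeling, and the prediction is certified when the whole interval lies on one side of the majority threshold.

```python
def certify_knn_vote(neighbor_labels, protected_mask, max_flips):
    k = len(neighbor_labels)
    pos = sum(neighbor_labels)
    # Each flipped protected neighbor moves the positive count by at most 1,
    # so an interval of width 2 * budget over-approximates all admissible biases.
    budget = min(max_flips, sum(protected_mask))
    pos_lo, pos_hi = pos - budget, pos + budget
    if pos_lo > k / 2:
        return "certified: predicts 1 under any admissible bias"
    if pos_hi < k / 2:
        return "certified: predicts 0 under any admissible bias"
    return "cannot certify: prediction may depend on the historical bias"

labels    = [1, 1, 1, 0, 1]      # labels of the 5 nearest neighbors
protected = [0, 1, 0, 0, 0]      # which of them belong to the protected group
print(certify_knn_vote(labels, protected, max_flips=1))
```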


Certifiably Robust Reinforcement Learning through Model-Based Abstract Interpretation

Yang, Chenxi, Anderson, Greg, Chaudhuri, Swarat

arXiv.org Artificial Intelligence

We present a reinforcement learning (RL) framework in which the learned policy comes with a machine-checkable certificate of provable adversarial robustness. Our approach, called CAROL, learns a model of the environment. In each learning iteration, it uses the current version of this model and an external abstract interpreter to construct a differentiable signal for provable robustness. This signal is used to guide learning, and the abstract interpretation used to construct it directly yields the robustness certificate returned at convergence. We give a theoretical analysis that bounds the worst-case cumulative reward of CAROL. We also experimentally evaluate CAROL on four MuJoCo environments with continuous state and action spaces. On these tasks, CAROL learns policies that, compared with policies from state-of-the-art robust RL algorithms, exhibit: (i) markedly enhanced certified performance lower bounds; and (ii) comparable performance under empirical adversarial attacks.
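
A heavily simplified sketch of this certificate-guided signal (assuming a linear learned model and a linear policy; CAROL itself uses a learned neural model and an external abstract interpreter): propagating an interval of initial states through the closed loop yields a lower bound on the accumulated reward that could, in principle, be used to guide training.

```python
import numpy as np

def affine_bounds(lo, hi, W, b):
    # Bound W @ x + b over the box [lo, hi] (standard interval arithmetic).
    Wp, Wn = np.maximum(W, 0), np.minimum(W, 0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

rng = np.random.default_rng(1)
A = 0.9 * np.eye(2)                      # hypothetical learned dynamics: s' = A s + B a
B = np.array([[0.1], [0.05]])
K = 0.1 * rng.normal(size=(1, 2))        # hypothetical linear policy: a = K s

s_lo, s_hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])   # adversarial start set
worst_reward = 0.0
for _ in range(5):
    # Closed-loop step s' = (A + B K) s, applied to the abstract (interval) state.
    s_lo, s_hi = affine_bounds(s_lo, s_hi, A + B @ K, np.zeros(2))
    # Per-step reward -||s||_inf: its worst case over the box is the largest magnitude.
    worst_reward -= float(np.max(np.maximum(np.abs(s_lo), np.abs(s_hi))))
print("certified lower bound on the 5-step reward:", worst_reward)
```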


Finding Minimum-Cost Explanations for Predictions made by Tree Ensembles

Törnblom, John, Karlsson, Emil, Nadjm-Tehrani, Simin

arXiv.org Artificial Intelligence

The ability to explain why a machine learning model arrives at a particular prediction is crucial when the model is used as decision support by human operators of critical systems. The provided explanations must be provably correct and preferably free of redundant information, in which case they are called minimal explanations. In this paper, we aim to find explanations for predictions made by tree ensembles that are not only minimal, but also minimum with respect to a cost function. To this end, we first present a highly efficient oracle that can determine the correctness of explanations, surpassing the runtime performance of current state-of-the-art alternatives by several orders of magnitude when computing minimal explanations. Second, we adapt an algorithm called MARCO from related work (calling it m-MARCO) to compute a single minimum explanation per prediction, and demonstrate an overall speedup factor of two compared to the MARCO algorithm, which enumerates all minimal explanations. Finally, we study the obtained explanations across a range of use cases, leading to further insights into their characteristics. In particular, we observe that in several cases there are more than 100,000 minimal explanations to choose from for a single prediction. In these cases, only a small portion of the minimal explanations are also minimum, and the minimum explanations are significantly less verbose, which motivates the aim of this work.
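
A stripped-down sketch of such a correctness oracle, for a single decision tree rather than a full ensemble (the tree and instance below are made up): an explanation, i.e., a set of features held fixed at the instance's values, is correct when every leaf reachable while the remaining features vary freely predicts the original class.

```python
# Tree nodes: ("leaf", cls) or (feature_index, threshold, left, right).
TREE = (0, 5.0,
        ("leaf", 0),
        (1, 2.0, ("leaf", 0), ("leaf", 1)))

def reachable_classes(node, x, fixed):
    if node[0] == "leaf":
        return {node[1]}
    feat, thr, left, right = node
    if feat in fixed:                       # feature pinned by the explanation
        branch = left if x[feat] <= thr else right
        return reachable_classes(branch, x, fixed)
    # Free feature: both branches remain reachable.
    return reachable_classes(left, x, fixed) | reachable_classes(right, x, fixed)

x = {0: 7.0, 1: 3.0}                             # instance predicted as class 1
print(reachable_classes(TREE, x, fixed={0}))     # {0, 1} -> not a correct explanation
print(reachable_classes(TREE, x, fixed={0, 1}))  # {1}    -> correct explanation
```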


Towards Preserving Semantic Structure in Argumentative Multi-Agent via Abstract Interpretation

Patil, Minal Suresh

arXiv.org Artificial Intelligence

Over the past twenty years, argumentation has received considerable attention in the fields of knowledge representation, reasoning, and multi-agent systems. However, argumentation in dynamic multi-agent systems faces the problem of the significant number of arguments generated by agents, which comes at the expense of representational complexity and computational cost. In this work, we investigate the notion of abstraction from a model-checking perspective, where several arguments defending the same position from various points of view are grouped together, thereby reducing the size of the argumentation framework whilst preserving the semantic flow structure of the system.
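
A small sketch of this reduction in code (an illustration only; the paper's construction is defined on the argumentation semantics, not as this ad hoc clustering): arguments with identical attack profiles, such as several arguments defending the same position against the same attacker, are merged into one abstract argument, shrinking the framework while keeping its attack structure.

```python
from collections import defaultdict

attacks = {("a1", "b"), ("a2", "b"), ("b", "c")}     # a1 and a2 both defend c against b
args = {"a1", "a2", "b", "c"}

def profile(x):
    # (who x attacks, who attacks x) determines whether arguments can be merged
    out = frozenset(t for s, t in attacks if s == x)
    inc = frozenset(s for s, t in attacks if t == x)
    return (out, inc)

clusters = defaultdict(list)
for x in args:
    clusters[profile(x)].append(x)

abstract_of = {x: "+".join(sorted(group)) for group in clusters.values() for x in group}
abstract_attacks = {(abstract_of[s], abstract_of[t]) for s, t in attacks}
print(sorted(set(abstract_of.values())))   # ['a1+a2', 'b', 'c']
print(sorted(abstract_attacks))            # [('a1+a2', 'b'), ('b', 'c')]
```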


Proceedings of the 2nd Workshop on Logic and Practice of Programming (LPOP)

Warren, David S., Van Roy, Peter, Liu, Yanhong A.

arXiv.org Artificial Intelligence

These proceedings contain abstracts and position papers for the work presented at the second Logic and Practice of Programming (LPOP) Workshop. The workshop was held online, virtually in place of Chicago, USA, on November 15, 2020, in conjunction with the ACM SIGPLAN Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH) 2020. The purpose of this workshop is to serve as a bridge between different areas of computer science that use logic as a practical tool, taking advantage of the common language of formal logic to exchange ideas between these areas.